This paper considers a class of variable metric methods for unconstrained minimization. Without requiring exact line searches it is shown that, under appropriate assumptions on the function to be minimized, each algorithm in this class converges globally and superlinearly.
SIGNIFICANCE AND EXPLANATION

Many practical problems in operations research may be reduced to minimizing a function with or without constraints. By means of penalty functions and similar techniques a constrained minimization problem can be converted into unconstrained minimization problems. In this paper we discuss methods for unconstrained minimization problems which converge rapidly from a starting point which is not necessarily a good approximation to the solution of the given problem.
GLOBAL AND SUPERLINEAR CONVERGENCE
OF A CLASS OF VARIABLE METRIC METHODS

Klaus Ritter

1. Introduction

Variable metric methods have been used successfully for iteratively calculating an approximation to the least value of a function $F(x)$ of $n$ variables. A variable metric method simultaneously generates a sequence of points $\{x_j\}$ and a sequence of matrices $\{H_j\}$. During each iteration a correction matrix of rank one or two is added to $H_j$ with the intent to construct an approximation to the inverse Hessian matrix of $F(x)$.

A large class of such methods has been introduced by Huang [9]. This class contains symmetric and unsymmetric matrices $H_j$. A restriction of the Huang class to update formulas which are of rank two, satisfy the quasi-Newton equation and maintain the symmetry of $H_j$ leads to a class of methods proposed by Broyden [1] and Fletcher [7]. Two well-known members of this class are the Davidon-Fletcher-Powell method (DFP method), [4], [6], and the Broyden-Fletcher-Goldfarb-Shanno method (BFGS method), [2], [7], [8], [16].

The first general global convergence result is due to Powell [12], [11] who proved that, if $F(x)$ satisfies certain assumptions and if the optimal step size is used, the DFP method converges superlinearly to a global minimizer of $F(x)$. In [5] Dixon showed that under certain conditions the methods in the Huang class generate the same sequence $\{x_j\}$ if they are started with the same initial $x_0$, $H_0$ and if the optimal step size is used. Under the idealized assumption of an optimal step size these two results therefore provide a complete convergence theory. In practice, however, it is in general not possible to use an optimal step size. Therefore, it is important to establish global convergence for a non-optimal step size.

One such result was obtained by Lenard [10] who generalized Powell's convergence proof for the DFP method. Another result is due to Powell [14] who proved that the BFGS method converges superlinearly with a step size procedure that eventually results in a step size equal to one. Using also a non-optimal step size Stoer [17] showed that every method in a subclass of the Broyden class, the so-called restricted Broyden methods, converges n-step quadratically for every positive definite starting matrix $H_0$ and every initial value $x_0$ sufficiently close to a minimizer $z$ of $F(x)$.

If it is assumed that both $x_0$ and $H_0$ are sufficiently close to $z$ and the inverse Hessian matrix of $F(x)$ at $z$, respectively, then it follows from results obtained by Broyden, Dennis and Moré [3] that the DFP method and the BFGS method converge superlinearly to $z$ with step size one.

It is the purpose of this paper to show that with an appropriate non-optimal step size every method in the Broyden class converges globally and superlinearly provided $F(x)$ satisfies certain assumptions. In the next section we derive a representation of the matrix $H_j$ as a sum of $n$ matrices of rank 1. This representation allows us to study the dependence of $H_{j+1}$ on the parameters used in the update formula for $H_j$ and leads to a simple proof of Dixon's result. In Section 3 global convergence is established. The proof is based on a generalization of Powell's proof for the BFGS method. In the final section it is shown that the sequence $\{x_j\}$ converges superlinearly and that the sequences $\{\|H_j\|\}$ and $\{\|H_j^{-1}\|\}$ are bounded.


2. Basic properties of variable metric methods

Let $x \in E^n$ and let $F(x)$ be a real-valued function. If $F(x)$ is twice differentiable at a point $x_j$, we denote the gradient and the Hessian matrix of $F(x)$ at $x_j$ by $g_j = \nabla F(x_j)$ and $G_j = G(x_j)$, respectively. A prime is used for the transpose of a vector or a matrix. For any $x \in E^n$, $\|x\|$ denotes the Euclidean norm of $x$.

We consider the problem of determining a vector $z$ such that

$F(z) \le F(x)$ for all $x \in E^n$.

For later reference we formulate the following assumption.

Assumption 1.

$F(x)$ is a convex function. There exists an $x_0$ such that the set

$S_0 = \{x \mid F(x) \le F(x_0)\}$

is bounded, and such that $F(x)$ is twice continuously differentiable on some convex open set containing $S_0$.

If a variable metric method is used to minimize $F(x)$, then at a given point $x_j$ a search direction $s_j$ is determined by multiplying the gradient $g_j = \nabla F(x_j)$ by an appropriate matrix $H_j$, i.e.,

$s_j = H_jg_j$,

where $H_j$ is an approximation to the inverse Hessian matrix of $F(x)$ at $x_j$. With a suitable step size $\sigma_j$ a new point

$x_{j+1} = x_j - \sigma_js_j$

is computed. If $g_{j+1} = \nabla F(x_{j+1}) \ne 0$, the matrix $H_{j+1}$ is determined from $H_j$ in such a way that the quasi-Newton equation is satisfied, i.e.,

(2.1)    $H_{j+1}(g_j - g_{j+1}) = \sigma_js_j$.
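The iteration just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quadratic test function $F(x) = \frac{1}{2}x'Ax - b'x$, the fixed non-optimal step size `sigma`, and the choice of the BFGS formula as the update are not prescribed by the text at this point; the helper names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def grad(A, b, x):
    # Gradient of the illustrative quadratic F(x) = 0.5 x'Ax - b'x.
    return [gi - bi for gi, bi in zip(mat_vec(A, x), b)]

def bfgs_update(H, p, d):
    # One concrete symmetric update with H_new d = p (BFGS, inverse form).
    n = len(p)
    Hd = mat_vec(H, d)
    pd = dot(p, d)
    c = (1.0 + dot(d, Hd) / pd) / pd
    return [[H[i][k] + c * p[i] * p[k] - (p[i] * Hd[k] + Hd[i] * p[k]) / pd
             for k in range(n)] for i in range(n)]

def variable_metric_step(A, b, x, H, sigma):
    g = grad(A, b, x)
    s = mat_vec(H, g)                                   # s_j = H_j g_j
    x_new = [xi - sigma * si for xi, si in zip(x, s)]   # x_{j+1} = x_j - sigma_j s_j
    g_new = grad(A, b, x_new)
    d = [gi - hi for gi, hi in zip(g, g_new)]           # g_j - g_{j+1}
    p = [sigma * si for si in s]                        # sigma_j s_j
    return x_new, bfgs_update(H, p, d), p, d

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x0 = [2.0, -1.0]
H0 = [[1.0, 0.0], [0.0, 1.0]]
x1, H1, p, d = variable_metric_step(A, b, x0, H0, sigma=0.2)

# Residual of the quasi-Newton equation H_{j+1}(g_j - g_{j+1}) = sigma_j s_j.
residual = [u - v for u, v in zip(mat_vec(H1, d), p)]
print(max(abs(r) for r in residual))
```

The printed residual should be at the level of rounding error, since the update is constructed so that the quasi-Newton equation holds exactly.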


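The various variable metric methods differ in the update procedure which is used to compute $H_{j+1}$ from $H_j$. In many methods $H_{j+1}$ is obtained by adding one or two matrices of rank one to $H_j$. A large class of such methods has been studied by Huang [9] and Dixon [5]. With

$d_j = \dfrac{g_j - g_{j+1}}{\|\sigma_js_j\|}$ and $p_j = \dfrac{s_j}{\|s_j\|}$,

their update formula can be written as follows:

(2.2)    $H_{j+1} = H_j + \rho\,\dfrac{p_j(\alpha_1p_j + \alpha_2H_j'd_j)'}{(\alpha_1p_j + \alpha_2H_j'd_j)'d_j} - \dfrac{H_jd_j(\beta_1p_j + \beta_2H_j'd_j)'}{(\beta_1p_j + \beta_2H_j'd_j)'d_j}$,

where $\rho$, $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are parameters such that $\alpha_1$ and $\alpha_2$ are not both zero and $\beta_1$ and $\beta_2$ are not both zero, and it is assumed that the denominators are not zero.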

The equation (2.1) is satisfied if and only if $\rho = 1$. Therefore, we shall always assume that $\rho = 1$. Under suitable assumptions the inverse Hessian matrix of $F(x)$ is symmetric. Since $H_j$ is intended to be an approximation to this matrix it is reasonable to restrict the parameters in such a way that $H_j$ is symmetric for all $j$. With $\rho = 1$ we obtain from (2.2)


It*  , • H*  ♦ 
i«i  1 (a 


’v';  * 


i rj 

*V'  ’ * I ' (r,o olMl/'d,  WA  • 


11  2 11  1 


Thus, if $H_j$ is symmetric then $H_{j+1}$ is symmetric if and only if


l • . 4 ' 


•vwi  * vwv  ■ -w;di  * vwv 


As sum t no  s ^ * 0 we  can  solve  (i,4'  foi  a^.  This  oives 


‘Sv'd  H,id:n. d, 


, . JL'J.  1 V . -1  .J  J J 
1 1 vr, 


Thn  *'t  »'i  r , 


v\ , 


"ipidi  * v;Vi  * * s|  * WiV 


.1  rM  ♦ .*  ,1'n  d r\l  (!'  v d Mi  d M d ' 

\'\  \ ' \ \ \ 1 rrMMM  ' rr i 


.Old 


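Substitution into (2.2) then gives the update formula for symmetric matrices

(2.5)    $H_{j+1} = H_j + \dfrac{\beta_1d_j'p_j + (\beta_1+\beta_2)d_j'H_jd_j}{(\beta_1d_j'p_j + \beta_2d_j'H_jd_j)\,d_j'p_j}\,p_jp_j' - \dfrac{\beta_1(p_jd_j'H_j + H_jd_jp_j')}{\beta_1d_j'p_j + \beta_2d_j'H_jd_j} - \dfrac{\beta_2H_jd_jd_j'H_j}{\beta_1d_j'p_j + \beta_2d_j'H_jd_j}$.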


The update formula (2.5) represents the subclass of the Huang class of methods with the property that all matrices $H_j$ are symmetric and satisfy the quasi-Newton equation. This subclass is identical with a class of update formulas obtained by Broyden [1] and, in a different form, by Fletcher [7].

First we consider three special cases. If we choose $\beta_1 = 0$ then (2.4) implies $\alpha_2 = 0$ and (2.5) reduces to

$H_{j+1} = H_j + \dfrac{p_jp_j'}{d_j'p_j} - \dfrac{H_jd_jd_j'H_j}{d_j'H_jd_j}$.

This is the update formula used in the Davidon-Fletcher-Powell method [4], [6]. With $\beta_2 = 0$ we obtain from (2.5)

(2.6)    $H_{j+1} = H_j + \dfrac{(d_j'p_j + d_j'H_jd_j)\,p_jp_j'}{(d_j'p_j)^2} - \dfrac{p_jd_j'H_j + H_jd_jp_j'}{d_j'p_j}$,

i.e., the update formula of the Broyden-Fletcher-Goldfarb-Shanno method [2], [7]. Finally if we choose $\beta_1 = 1$ and $\beta_2 = -1$, then (2.5) becomes

(2.7)    $H_{j+1} = H_j + \dfrac{(p_j - H_jd_j)(p_j - H_jd_j)'}{(p_j - H_jd_j)'d_j}$.

This is a symmetric rank one update formula. Because the vectors $p_j - H_jd_j$ and $d_j$ can become (nearly) orthogonal it is, however, known to be unstable and not recommended.
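The two-parameter family and its three special cases can be checked numerically. The sketch below writes the symmetric update as $H + c_{pp}\,pp' + c_{mix}(p\,d'H + Hd\,p') + c_{hh}\,Hd\,d'H$ with coefficients depending on $\beta_1$, $\beta_2$; this coefficient form is an assumption, chosen so that it reproduces the DFP, BFGS and rank-one formulas for the parameter values named above, and the function names and test data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def broyden_class_update(H, p, d, beta1, beta2):
    # Symmetric rank-two update with H_new d = p, parametrized by (beta1, beta2).
    n = len(p)
    Hd = mat_vec(H, d)
    a = dot(d, p)                       # d'p
    b = dot(d, Hd)                      # d'Hd
    denom = beta1 * a + beta2 * b       # assumed to be nonzero
    c_pp = (beta1 * a + (beta1 + beta2) * b) / (a * denom)
    c_mix = -beta1 / denom
    c_hh = -beta2 / denom
    return [[H[i][k] + c_pp * p[i] * p[k]
             + c_mix * (p[i] * Hd[k] + Hd[i] * p[k])
             + c_hh * Hd[i] * Hd[k]
             for k in range(n)] for i in range(n)]

def dfp_update(H, p, d):
    n = len(p)
    Hd = mat_vec(H, d)
    a, b = dot(d, p), dot(d, Hd)
    return [[H[i][k] + p[i] * p[k] / a - Hd[i] * Hd[k] / b
             for k in range(n)] for i in range(n)]

def rank_one_update(H, p, d):
    n = len(p)
    v = [pi - hi for pi, hi in zip(p, mat_vec(H, d))]   # p - Hd
    vd = dot(v, d)
    return [[H[i][k] + v[i] * v[k] / vd for k in range(n)] for i in range(n)]

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

H = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]]
p = [1.0, 0.0, -1.0]
d = [0.8, 0.3, -0.5]

for beta1, beta2 in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]:
    H_new = broyden_class_update(H, p, d, beta1, beta2)
    qn_error = max(abs(x - y) for x, y in zip(mat_vec(H_new, d), p))
    print(beta1, beta2, qn_error)
```

Every member of the family satisfies $H_{j+1}d_j = p_j$ by construction; only the way the correction is distributed over $p_j$ and $H_jd_j$ changes with the parameters.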


Returning to the general formula (2.5) we assume that $H_j$ is positive definite.


, . Vi- 
' lIVlH 


tl  o -H  o , 
and  lt  d - -J-.J— 

3 1 "'V;i.. 


we observe that with

$T_j = \{x \mid (H_jg_j)'x = (H_jg_{j+1})'x = 0\}$

we have

(2.8)    $H_{j+1}x = H_jx$ for $x \in T_j$.

Since $H_j$ is positive definite, $g_j \notin T_j$ and $g_{j+1} \notin T_j$. Hence using (2.8) we can determine $H_{j+1}$ completely by defining it on

$\mathrm{span}\{g_j, g_{j+1}\}$.


For  this  purpose  we  write  H.  as  a sume  of  throe  mat v loots . Sett  mo 


Ho. 

-.2-2- 


'Vil 


Ami  ohoosmo  w 


1 1 


i iiVjll  ’ ’3 

such  that  w ' i'  - o and  o * H w,  has  norm  one  wo  hove 


V i 


1 11 


(2.01 


itii,  . 

1 I 


whore  H is  a symmetric  matiix  of  rank  n - 2 with 


and 


H , o . - H w,  « 0 

1 1 11 


H x - H , x for  x.T, 

1 1 1 


Note  that  H . can  he  wi itten  in  the  form 


(2.10) 


V 

i d*  p 
i • ' i i i i 


wliere  d d are  vectors  in  r such  that 

' 1 n 1 1 


ill.ll  J,  ■ 0 i ,k  . ,n,  i * k 

1 1 1 k i 


■t  e inline 


it follows that, for every choice of the parameters $\beta_1$ and $\beta_2$, $H_{j+1}w_j$ is a vector in $\mathrm{span}\{q_j, p_j\}$ which is orthogonal to $d_j$.

Let $u_j$ be a vector such that

(2.13)    $u_j \in \mathrm{span}\{q_j, p_j\}$, $\|u_j\| = 1$, $d_j'u_j = 0$, $w_j'u_j > 0$.

Since $d_j'p_j \ne 0$ and $w_j \ne 0$ it follows that $u_j$ exists and is uniquely determined. Therefore, using (2.12) and (2.13) we have

(2.14)    $H_{j+1}w_j = \omega_ju_j$,

where $\omega_j$ is a number that depends on the particular values of the parameters $\beta_1$ and $\beta_2$ used to determine $H_{j+1}$. Combining (2.9), (2.11), and (2.14) we see that

(2.15)    $H_{j+1} = \dfrac{p_jp_j'}{d_j'p_j} + \omega_j\dfrac{u_ju_j'}{w_j'u_j} + \hat{H}_j$.


Thus all matrices $H_{j+1}$ defined by (2.5) are of the form (2.15) and differ only in the factor $\omega_j$. Furthermore, if $H_j$ is positive definite and if $d_j'p_j > 0$, then $H_{j+1}$ is positive definite if and only if $\omega_j > 0$.


In order to study the dependence of $H_{j+1}$ on the parameters $\beta_1$ and $\beta_2$ more closely we first determine $\omega_j$ for the BFGS method. From (2.6) we obtain


p d*H  w 

H , w . - H w - * *"* 

j*l  1 J j d*t> 


*j2i  ,, 

1 llipi  1 


qj  ‘ Vi 


where  i - d /d'.p,,  Thus 

1 ) ) j j 


‘i  ♦>\p 


(J.16) 


ui  ■ iTvvtir'  -i  ’ iiqi  * Vi 


Observing  that  by  (.!.>" 


1 7) 


d!H.d  . 


d^v 

>i'>l . 

. ,,  J.J 

0 . >1  ' p . 

) w ,q , 

J j 1 

i 1 

id’p^-’ 

(d!q.)2 

♦ X a 

e . q p , 

w'.q  . 

1 1 1 

1 ) 

we  have  for  the  qeneral  update  formula  (2.f>) 


"1mW'  " ^ " Vi  Ve2d  j V ) J 


i.i  \i  ' 1 


(i\d!p  . e< 'll  >i-i;  , >q 

l r J -lit  t f j J J j j 2 w.q. 


d'p  d q 


.q’p.  1 

1 1 


a'".; 


M 


(d’p .) 

;\  1 1'  — -J.  - 

1 ) i >'  .0  Jp  , 

i.r.i 


. i'J  , 

■ >1.1-,  i i', d t>  «h  d’H.d. 


i 1 


l rj  - i i 1 


Thus

(2.18)    $\omega_jw_j'u_j = \gamma_jw_j'q_j$

with

(2.19)    $\gamma_j = \dfrac{\beta_1d_j'p_j + \beta_2\dfrac{(d_j'p_j)^2}{\rho_jg_j'p_j}}{\beta_1d_j'p_j + \beta_2d_j'H_jd_j}$


and $\gamma_j = 1$ if $\beta_2 = 0$, i.e., for the BFGS method. For the DFP method we have

(2.20)    $\gamma_j = \dfrac{(d_j'p_j)^2}{\rho_jg_j'p_j\,d_j'H_jd_j}$.


Assuming that $d_j'p_j > 0$ and $H_j$ is positive definite we see that the subset of the updating formulas (2.5) with

$\beta_1\beta_2 \ge 0$, $\beta_1 + \beta_2 \ne 0$

preserves the positive definiteness of $H_j$. More generally we have the following result.

Lemma 1

Let $H_0$ be a symmetric positive definite matrix and assume that, for every $j$, $d_j'p_j > 0$ and $H_{j+1}$ is determined by (2.5). Then $H_{j+1}$ is positive definite for every $j$ if at least one of the following two conditions is satisfied.

i)  $\beta_1\beta_2 \ge 0$, $\beta_1 + \beta_2 \ne 0$


ii)  $\left(\beta_1 + \beta_2\dfrac{d_j'p_j}{\rho_jg_j'p_j}\right)\left(\beta_1 + \beta_2\dfrac{d_j'H_jd_j}{d_j'p_j}\right) > 0$.


Proof:

Observing that by (2.15) $H_{j+1}$ is positive definite if and only if $\omega_j > 0$ we see that the lemma follows from (2.18) and (2.19).
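A numerical check of the first condition of the lemma may help. The sketch below rests on an assumption that is not stated in the text: for $\beta_1\beta_2 \ge 0$, $\beta_1 + \beta_2 \ne 0$ the update is formed as the convex combination $\theta H^{BFGS} + (1-\theta)H^{DFP}$ with $\theta = \beta_1d'p/(\beta_1d'p + \beta_2d'Hd)$. Positive definiteness is tested with a hand-written Cholesky factorization; all names and data are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def bfgs_update(H, p, d):
    n, Hd, pd = len(p), mat_vec(H, d), dot(p, d)
    c = (1.0 + dot(d, Hd) / pd) / pd
    return [[H[i][k] + c * p[i] * p[k] - (p[i] * Hd[k] + Hd[i] * p[k]) / pd
             for k in range(n)] for i in range(n)]

def dfp_update(H, p, d):
    n, Hd, pd = len(p), mat_vec(H, d), dot(p, d)
    dHd = dot(d, Hd)
    return [[H[i][k] + p[i] * p[k] / pd - Hd[i] * Hd[k] / dHd
             for k in range(n)] for i in range(n)]

def convex_class_update(H, p, d, beta1, beta2):
    # Assumed reading of condition i): theta lies in [0, 1] when d'p > 0,
    # H is positive definite, beta1*beta2 >= 0 and beta1 + beta2 != 0.
    a, b = dot(d, p), dot(d, mat_vec(H, d))
    theta = beta1 * a / (beta1 * a + beta2 * b)
    HB, HD = bfgs_update(H, p, d), dfp_update(H, p, d)
    n = len(p)
    return [[theta * HB[i][k] + (1.0 - theta) * HD[i][k]
             for k in range(n)] for i in range(n)]

def is_positive_definite(M):
    # The Cholesky factorization of a symmetric matrix succeeds with
    # positive pivots exactly when the matrix is positive definite.
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(i + 1):
            s = M[i][k] - sum(L[i][m] * L[k][m] for m in range(k))
            if i == k:
                if s <= 0.0:
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][k] = s / L[k][k]
    return True

H = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]]
p = [1.0, 0.0, -1.0]
d = [0.8, 0.3, -0.5]          # d'p = 1.3 > 0

for beta1, beta2 in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.3, 2.0)]:
    H_new = convex_class_update(H, p, d, beta1, beta2)
    print(beta1, beta2, is_positive_definite(H_new))
```

Both the BFGS and the DFP update preserve positive definiteness when $d_j'p_j > 0$, so any convex combination of them does as well; this is the reason the check is expected to succeed for every parameter pair in the loop.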


From (2.17) and (2.19) we obtain

(2.21)    $\gamma_j = \dfrac{\beta_1d_j'p_j + \beta_2d_j'H_jd_j - \beta_2\dfrac{(d_j'q_j)^2}{w_j'q_j}}{\beta_1d_j'p_j + \beta_2d_j'H_jd_j} = 1 - \dfrac{\beta_2\dfrac{(d_j'q_j)^2}{w_j'q_j}}{\beta_1d_j'p_j + \beta_2d_j'H_jd_j}$.

If $d_j'q_j = 0$ then $\gamma_j = 1$ and, by (2.16), $u_j = q_j$ and $\omega_j = 1$. Therefore, it follows from (2.15) that in this case $H_{j+1}$ is independent of the parameters $\beta_1$ and $\beta_2$. Since

$d_j'q_j = \dfrac{(g_j - g_{j+1})'q_j}{\|\sigma_js_j\|} = \dfrac{-g_{j+1}'q_j}{\|\sigma_js_j\|}$

this happens if and only if $g_j$ and $g_{j+1}$ are parallel. Excluding this case we have the following lemma.


Lemma 2

Let $H_j$ be positive definite and suppose that $d_j'p_j > 0$ and $d_j'q_j \ne 0$. If $\beta_1d_j'p_j + \beta_2d_j'H_jd_j \ne 0$, then

i)  $\gamma_j = 1$ if and only if $\beta_2 = 0$

ii)  $\gamma_j > 1$ if and only if $\dfrac{\beta_2}{\beta_1d_j'p_j + \beta_2d_j'H_jd_j} < 0$


ill)  0 < y.  < 1 if  and  only  if  B.  + e — > o and  either  B_  > 0 or 
3 1 2 p.g'.p.  2 

3 3 3 

B_  < 0 and  B.d'.p.  + 8 ..d'.H.d.  < o . 

2 1 2 j j 3 


The  first  statement  of  the  lemma  follows  immediately  from  (2.21).  Let  82  / 0.  Suppose 

first  that  B.d'.p.  + B„d!H.d.  > 0.  By  (2.21)  we  have  y.  > 1 if  B <0  and  y,  < 1 if 
1 3 3 “■  3 3 3 3 2 j 

B2  > 0 in  which  case  it  follows  from  (2.19)  that  Yj  > 0 if  and  only  if  B^P.gjP.  + Bjdlp.  > 0. 

Next  let  B.d'.p.  + B.d'.H.d.  < 0.  Then  B_  > 0 implies  y.  > 1 and  8,  < 0 implies  y.  < 1 
*■33^333  * j 2 j 




with 


1 if  and  only  if  + B2dj“j  > °’  since  SjdjPj  * Sjd^H^  ' 0 8^  '■  0 

and  B.r.q'.p.  + B.d'.p.  > 0,  this  completes  the  proof  of  the  lemma. 

1 I 1 5 2 jr  1 

The above lemma shows that all update formulas (2.5) with

$\beta_1\beta_2 \ge 0$, $\beta_1 + \beta_2 \ne 0$

in addition to preserving the positive definiteness of $H_j$ produce a $\gamma_j$ with $0 < \gamma_j \le 1$.

Let $\gamma_j^{D}$ and $\gamma_j^{B}$ denote the value of $\gamma_j$ that corresponds to the DFP method and the BFGS method, respectively. It is interesting to observe that, if $d_j'p_j > 0$ and $d_j'q_j \ne 0$, (2.21) implies

$0 < \gamma_j^{D} \le \gamma_j \le \gamma_j^{B} = 1$

for every $\gamma_j$ corresponding to an update formula (2.5) with

$\beta_1\beta_2 \ge 0$.

For the results obtained so far we have only assumed that $\sigma_j$ is chosen in such a way that $d_j'p_j > 0$, i.e., $g_{j+1}'p_j < g_j'p_j$. Now we assume that $\sigma_j$ is the optimal step size; more precisely let $\sigma_j$ be the smallest value of $\sigma$ such that

$F(x_j - \sigma_js_j) = \min\{F(x_j - \sigma s_j) \mid \sigma \ge 0\}$.

Then $g_{j+1}'p_j = 0$ and it follows from the definition of $w_j$ that


(.'  22) 


W 

3 


^ . A ,n  . . . where  \ . . 
j ♦ 1 i + 1 3 4 1 


i-i 


"’here  fore  , 10  becomes 


'Y’i  Vi 

ii.  , - v . — ♦ ii. 

i+1  diP;i  3 X1  + ,q;  + lUj  1 


and 

(2.23) 


Vi  ‘ ViVi  ' ta,j  T 


i+i 


Hence the search directions at $x_{j+1}$ computed by any of the matrices (2.5) differ only in the factor $\omega_j$. This observation suggests a simple proof for a theorem due to Dixon [5] which essentially states that, if the optimal step size is used, all members of the class (2.5) of update formulas produce the same sequence of points $\{x_j\}$.

Theorem 1

Let an initial point $x_0$ and a symmetric positive definite matrix $H_0$ be given. Suppose that for every $j$, $\sigma_j$ is the optimal step size,

$s_j = H_jg_j$, $x_{j+1} = x_j - \sigma_js_j$

and $H_{j+1}$ is determined by (2.5). Any choice of the parameters $\beta_1$ and $\beta_2$ for which $\omega_j > 0$, i.e., $\gamma_j > 0$ for all $j$, results in the same sequence of points $\{x_j\}$.

Proof:

Suppose that, for some $j$, all matrices $H_j$ in the class generated by the update formulas (2.5) have the form

(2.24) 


II  « to 


Vi  ♦ WidL  + „ 

i-i  xjqjPj  dj_1pJ.1  3-1 


where  only  m ^ ^ depends  on  the  particular  values  of  and  . Since  the  optimal  step  size 

is  used  it  follows  that  x^j  qlrl  are  ihdependent  of  Thus  spanlH^q^, 

is  independent  of  u' , ^ which  implies  that  p,  ^ = u , is  independent  of  to . ^ . Thus  we  can 
write 

p3p;  p1+ipi+i 

H,  " 4 7 J J ■ ~ + H. 

i i-1  XjqJPj  Xj+1q5  + 1Pjn  j 

where  A ^ + * is  independent  of  ^ and  the  matrix  H ^ is  as  defined  in  (2.10) 

and  independent  of  ui^  j . Therefore  (2.15)  becomes 


V pirlpi4l 

ii,,,  “ rh1  + <»,  t 4 h, 

1+1  djpj  1 ' jtlq j+lpj4l  1 


This representation of $H_{j+1}$ is equivalent to the representation (2.24) of $H_j$. Since (2.24) holds for $j = 1$, this proves the theorem.
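The statement of the theorem can be illustrated on a small example. The sketch assumes a strictly convex quadratic $F(x) = \frac{1}{2}x'Ax - b'x$, for which the optimal step size along $s_j$ has the closed form $\sigma_j = g_j's_j/(s_j'As_j)$, and it compares only two members of the class (DFP and BFGS) started from $H_0 = I$. The test matrix and all function names are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def dfp_update(H, p, d):
    n, Hd, pd = len(p), mat_vec(H, d), dot(p, d)
    dHd = dot(d, Hd)
    return [[H[i][k] + p[i] * p[k] / pd - Hd[i] * Hd[k] / dHd
             for k in range(n)] for i in range(n)]

def bfgs_update(H, p, d):
    n, Hd, pd = len(p), mat_vec(H, d), dot(p, d)
    c = (1.0 + dot(d, Hd) / pd) / pd
    return [[H[i][k] + c * p[i] * p[k] - (p[i] * Hd[k] + Hd[i] * p[k]) / pd
             for k in range(n)] for i in range(n)]

def run(A, b, x0, update, steps):
    n = len(x0)
    H = [[1.0 if i == k else 0.0 for k in range(n)] for i in range(n)]
    x, points = list(x0), [list(x0)]
    for _ in range(steps):
        g = [gi - bi for gi, bi in zip(mat_vec(A, x), b)]
        s = mat_vec(H, g)
        sigma = dot(g, s) / dot(s, mat_vec(A, s))   # optimal step for a quadratic
        x_new = [xi - sigma * si for xi, si in zip(x, s)]
        g_new = [gi - bi for gi, bi in zip(mat_vec(A, x_new), b)]
        p = [sigma * si for si in s]
        d = [gi - hi for gi, hi in zip(g, g_new)]
        H = update(H, p, d)
        x = x_new
        points.append(list(x))
    return points

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x0 = [0.0, 0.0, 0.0]

points_dfp = run(A, b, x0, dfp_update, steps=3)
points_bfgs = run(A, b, x0, bfgs_update, steps=3)
for P, Q in zip(points_dfp, points_bfgs):
    print(max(abs(a - c) for a, c in zip(P, Q)))
```

The matrices $H_j$ produced by the two updates differ, but with the optimal step size the iterates agree up to rounding, and on a quadratic in three variables both runs reach the minimizer after three steps.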
In practical computation $\sigma_j$ differs from the optimal step size and numerical experience shows that the efficiency of a variable metric method depends very much on the particular update formula (2.5) which is being used. From (2.15) we obtain


1 + 1 


- 


j + rj+1  Kj  d’Pj 


PiqHl  Uiqitl 

Pr  -irf-—  + K).u,  ±ri  . 


j j Vl 


Thus depending on $p_j'g_{j+1}$ and $\gamma_j$, i.e., on the closeness of the step size used to the optimal step size and on the choice of $\beta_1$ and $\beta_2$, the directions $s_{j+1}$ can differ considerably.
3. Convergence

For any initial point $x_0$ for which Assumption 1 is satisfied and any symmetric positive definite matrix $H_0$ let $\{x_j\}$ be a sequence with the following properties:

i)  $F(x_{j+1}) \le F(x_j)$, $j = 0,1,\dots$

ii)  $x_{j+1} = x_j - \sigma_js_j$, $\sigma_j > 0$, $j = 0,1,\dots$

iii)  $H_{j+1}$ is obtained from $H_j$ by (2.5) with arbitrary parameters $\beta_1$ and $\beta_2$.

Throughout the remainder of the paper we shall assume that, if necessary, the parameters $\beta_1$ and $\beta_2$ are adjusted in such a way that $H_{j+1}$ is defined and positive definite, i.e., that the conditions of Lemma 1 are satisfied.

It is the purpose of this section to show that if Assumption 1 is satisfied and $\sigma_j$ is chosen appropriately, then the sequence $\{g_j\}$ converges to zero and every cluster point of the sequence $\{x_j\}$ is a global minimizer of $F(x)$.

We shall prove this result by generalizing a proof due to Powell [14] for the case of the BFGS method, i.e., $\beta_1 = 1$, $\beta_2 = 0$. Powell's proof uses the inverse of $H_j$ rather than $H_j$. Setting

$B_j = H_j^{-1}$

we obtain from (2.9) and (2.10)

(3.1)    $B_j = \dfrac{\rho_jg_jg_j'}{g_j'p_j} + \dfrac{w_jw_j'}{w_j'q_j} + \hat{B}_j$.

Similarly, (2.15) implies

(3.2)    $B_{j+1} = \dfrac{d_jd_j'}{d_j'p_j} + \dfrac{w_jw_j'}{\omega_jw_j'u_j} + \hat{B}_j$.


T I 11 
As a first step we derive a relation between the trace of $B_j$ and the trace of $B_{j+1}$. By definition the trace of $B_j$ is equal to the sum of the eigenvalues of $B_j$ which is equal to the sum of the diagonal elements of $B_j$. Since with $H_j$ the matrix $B_j$ is positive definite, too, the trace of $B_j$ is positive. From (3.1) we obtain

$\mathrm{tr}(B_j) = \dfrac{\rho_j\|g_j\|^2}{g_j'p_j} + \dfrac{\|w_j\|^2}{w_j'q_j} + \mathrm{tr}(\hat{B}_j)$.

Thus using (3.2) we have

(3.3)    $\mathrm{tr}(B_{j+1}) = \mathrm{tr}(B_j) - \dfrac{\rho_j\|g_j\|^2}{g_j'p_j} + \dfrac{\|d_j\|^2}{d_j'p_j} - \dfrac{\|w_j\|^2}{w_j'q_j} + \dfrac{\|w_j\|^2}{\omega_jw_j'u_j}$

$\phantom{(3.3)\quad \mathrm{tr}(B_{j+1})} = \mathrm{tr}(B_j) - \dfrac{\rho_j\|g_j\|^2}{g_j'p_j} + \dfrac{\|d_j\|^2}{d_j'p_j} + \dfrac{1-\gamma_j}{\gamma_j}\dfrac{\|w_j\|^2}{w_j'q_j}$,


where the last equality follows from the definition of $\gamma_j$ (see (2.18)) and

wi(  v vV  wiqi 

l ’.4)  w'u  - J J-  J-  . -J-  J- — 

' 1 HVVill  IIVWI 

m.  • »i  * p ^ n fot  all  I,  we  deduce  from  (1.1)  the  inequality 


1 1 (B  ,)  tr (B  ) 

j 1 1 0 


i KHa  | i >->t  iftir' 

‘ iMl  Wi  * i'.O  Vi  _WK 


Next we establish a relation between the determinants of $B_{j+1}$ and $B_j$. For the special case of the BFGS method, i.e., for $\gamma_j = 1$, the result has been obtained by Pearson [11].

Lemma 3

Let $B_j$ and $B_{j+1}$ be defined by (3.1) and (3.2), respectively. Then


■1  / . ’j.  , qj  , ,!j± 

, VvJpj  ‘-Vi 


Jj  %pnj 


r 


, ~(S  *1  \ 

1M  W’l’  'ViV  ' ,l*ni*’ni  / 


Then  it  follows  from  ( 2 . l»)  and  (3.2)  that 


Therefore, 


hj  - o;V  *nd  * n3*iDj>i  • 


“•tnVlV  ’ d*UDjtlDjUDjlDi"l) 


2 


- (det (D]  ,D.  )>  - ^ — . 

1+1  1 YiVj 


which because of $B_j = H_j^{-1}$ implies

(3.6)    $\det(B_{j+1}) = \dfrac{d_j'p_j}{\gamma_j\rho_jg_j'p_j}\det(B_j)$.

For the BFGS method, $\gamma_j = 1$. Assuming that

(3.7)    $\dfrac{\|d_j\|^2}{d_j'p_j} \le \delta_0$ for some $\delta_0$ and all $j$

Powell [14] used (3.5) and (3.6) to prove that

$\liminf_{j \to \infty}\|g_j\| = 0$.
A review of Powell's proof shows that it can be adapted for a general update formula of type (2.5) if in addition to (3.7) we have

(3.8)    $\dfrac{1-\gamma_j}{\gamma_j}\dfrac{\|w_j\|^2}{w_j'q_j} \le \delta_1$ for some $\delta_1 > 0$ and all $j$

and

(3.9)    $\gamma_j \le \delta_2$ for some $\delta_2 \ge 1$ and all $j$.


Unfortunately, it does not seem to be possible to determine any choice of the parameters $\beta_1$ and $\beta_2$ (other than $\beta_1 = 1$, $\beta_2 = 0$, resulting in $\gamma_j = 1$) for which (3.8) and (3.9) can be verified a priori: Indeed, if $\gamma_j > 1$, then (3.8) is satisfied. However, by Lemma 2 we have

$\dfrac{\beta_2}{\beta_1d_j'p_j + \beta_2d_j'H_jd_j} < 0$

which by (2.21) could result in an arbitrarily large $\gamma_j$. On the other hand, if we choose $\beta_1$ and $\beta_2$ such that $\gamma_j < 1$ it does not seem to be possible to find a positive lower bound for $\gamma_j$. Thus $(1-\gamma_j)/\gamma_j$ may become arbitrarily large. Even if these numbers are bounded, $\|w_j\|^2/w_j'q_j$ could become large since we cannot show a priori that the sequence $\{\|B_j\|\}$ is bounded.

In order to overcome this difficulty we replace the matrix $H_j$ by a matrix $H_j(\eta_j)$ which is defined as follows.


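(3.10)    $H_j(\eta_j) = H_j + \dfrac{\eta_j}{1-\eta_j}\dfrac{p_jp_j'}{\rho_jg_j'p_j}$, $\eta_j < 1$.

Setting

$s_j(\eta_j) = H_j(\eta_j)g_j$

and with a modified step size

$\sigma_j(\eta_j) = (1-\eta_j)\sigma_j$

we obtain

$H_j(\eta_j) = \dfrac{1}{1-\eta_j}\dfrac{p_jp_j'}{\rho_jg_j'p_j} + \dfrac{q_jq_j'}{w_j'q_j} + \hat{H}_j$, $s_j(\eta_j) = \dfrac{1}{1-\eta_j}s_j$

and

$x_{j+1} = x_j - \sigma_j(\eta_j)s_j(\eta_j) = x_j - \sigma_js_j$.

Furthermore (2.15) shows that $H_{j+1}$ is not affected by the change in $H_j$.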
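The effect of this modification can be verified numerically: the search direction is stretched by the factor $1/(1-\eta_j)$, and the shortened step size compensates for it, so the new point is unchanged. The sketch below takes $\rho_j = 1/\|s_j\|$, which is an assumption made here so that $H_j^{-1}p_j = \rho_jg_j$; the function name and the numerical data are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def modified_matrix(H, g, eta):
    # H(eta) = H + eta/(1 - eta) * p p' / (rho g'p), with p = s/||s||, s = H g.
    n = len(g)
    s = mat_vec(H, g)
    norm_s = math.sqrt(dot(s, s))
    p = [si / norm_s for si in s]
    rho = 1.0 / norm_s                      # assumption: rho_j = 1/||s_j||
    scale = eta / ((1.0 - eta) * rho * dot(g, p))
    return [[H[i][k] + scale * p[i] * p[k] for k in range(n)] for i in range(n)]

H = [[2.0, 0.5], [0.5, 1.0]]
g = [1.0, -2.0]
x = [0.5, 0.5]
sigma, eta = 0.7, 0.4

s = mat_vec(H, g)
s_eta = mat_vec(modified_matrix(H, g, eta), g)      # expected: s / (1 - eta)
x_plain = [xi - sigma * si for xi, si in zip(x, s)]
x_modified = [xi - (1.0 - eta) * sigma * si for xi, si in zip(x, s_eta)]
print(x_plain, x_modified)
```

The two printed points should coincide up to rounding, which is the sense in which replacing $H_j$ by $H_j(\eta_j)$ changes the bookkeeping of the proof but not the iterates.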
Denoting the inverse matrix of $H_j(\eta_j)$ by $B_j(\eta_j)$ we see from (3.1) that

$B_j(\eta_j) = (1-\eta_j)\dfrac{\rho_jg_jg_j'}{g_j'p_j} + \dfrac{w_jw_j'}{w_j'q_j} + \hat{B}_j = B_j - \eta_j\dfrac{\rho_jg_jg_j'}{g_j'p_j}$.

Using the same argument as in the proof of Lemma 3 it is easy to verify that

$\det(B_j(\eta_j)) = (1-\eta_j)\det(B_j)$.

Therefore, replacing $B_j$ and $B_{j+1}$ by $B_j(\eta_j)$ and $B_{j+1}(\eta_{j+1})$, respectively, in (3.3) and (3.6) we obtain

(3.11)    $\mathrm{tr}(B_{j+1}(\eta_{j+1})) = \mathrm{tr}(B_j(\eta_j)) - \dfrac{(1-\eta_j)\rho_j\|g_j\|^2}{g_j'p_j} + \dfrac{\|d_j\|^2}{d_j'p_j} + \dfrac{1-\gamma_j}{\gamma_j}\dfrac{\|w_j\|^2}{w_j'q_j} - \eta_{j+1}\dfrac{\rho_{j+1}\|g_{j+1}\|^2}{g_{j+1}'p_{j+1}}$

and

(3.12)    $\det(B_{j+1}(\eta_{j+1})) = \dfrac{1-\eta_{j+1}}{\gamma_j}\dfrac{d_j'p_j}{(1-\eta_j)\rho_jg_j'p_j}\det(B_j(\eta_j))$.


If we assume that $\sigma_j$ is the optimal step size, then it follows from (2.22) and (2.23) that


X , 


wi  ■ ViVr  pjti  ■ v pj+i*  irf 


which  by  (2.1)3)  ami  (3.4)  implies 


wvi1  i y 

♦ iv - 1 Tj  Vi 


Thus if we set

$\eta_{j+1} = 1 - \gamma_j$

then the sum of the last two terms in (3.11) is zero and

$\dfrac{1-\eta_{j+1}}{\gamma_j} = 1$.
Since it suffices to have the sum of the last two terms in (3.11) bounded from above, $\sigma_j$ need only be an approximation to the optimal step size which satisfies the following condition.

Condition 1

The step size $\sigma_j$ is determined such that, for all $j$,


(3.  13) 


(1-1  .) 


Vi' 


(gj+i'tjdj)  Hj+i(9j+i_t jdj’  gj+iHj+igj+i 


1*3  * 


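where $\delta_3$ is an arbitrary positive constant and

$\varepsilon_j = \dfrac{g_{j+1}'p_j}{d_j'p_j}$.

For $\gamma_j = 1$ Condition 1 is trivially satisfied. If $\sigma_j$ is the optimal step size, then $\varepsilon_j = 0$. Therefore, for every $j$, there is an interval, containing the optimal step size, such that every $\sigma_j$ in this interval satisfies Condition 1.

Since $H_{j+1}(g_{j+1} - \varepsilon_jd_j) \in \mathrm{span}\{q_j, p_j\}$ and $d_j'H_{j+1}(g_{j+1} - \varepsilon_jd_j) = 0$ it follows that

(3.14)    $u_j = \dfrac{s_{j+1} - \varepsilon_jp_j}{\|s_{j+1} - \varepsilon_jp_j\|}$ and $w_j = \omega_j\dfrac{g_{j+1} - \varepsilon_jd_j}{\|s_{j+1} - \varepsilon_jp_j\|}$.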


Thus 


(g 


^ll2  i ii^n2 


,j+i"cjdj),Hj*i<9j+reidj)  “jwjuj  wjqj  ' 


and  observing  that  p_.+1  = sj+1/ll  sj+1  II  = pj  + isj+i  we  deduce  from  (3.13)  the  inequality 


Choosing 


1'Yi  Pi+J|gi+J2 

1 — f (1-Yj  -3—r 3_i < «,  . 

Y3  W3qj  3 9j+lPj+l 


Vi  = '-'i 


and assuming that the inequalities (3.7) and (3.13) are satisfied we obtain from (3.11) the relation

$\mathrm{tr}(B_{j+1}(\eta_{j+1})) - \mathrm{tr}(B_j(\eta_j)) \le -\dfrac{(1-\eta_j)\rho_j\|g_j\|^2}{g_j'p_j} + \delta_0 + \delta_3$

which shows that, for every $j$,

(3.15)    $\mathrm{tr}(B_{j+1}(\eta_{j+1})) \le \mathrm{tr}(B_0) - \sum_{i=0}^{j}\dfrac{(1-\eta_i)\rho_i\|g_i\|^2}{g_i'p_i} + (j+1)(\delta_0 + \delta_3)$,

where $B_0(\eta_0) = B_0$.

Using this upper bound for the trace of $B_{j+1}(\eta_{j+1})$ we can prove the following key lemma.

Lemma 4

Suppose the inequalities (3.7) and (3.13) are satisfied. Then there is $\delta_4 > 0$ such that for $j = 0,1,\dots$,


(3.16) 


i II  II  1 i dtp 

n ~sk~  - 41  n 5?r 

i-0  Mif i i-0  i i 


Proof : 


Since  is  positive  definite  for  all  },  we  obtain  from  (3.15) 


f (l-ni)Pil|qi|| 

~ <i!p. 

i»0  i'  i 


< tr(B0)  * (j  + l)(«  3-S^  < (1  + 1)3,. 


with  3^  « tr  (h())  + 3fl  + 3 ( . Applyinq  the  qeometric/arithmetio  mean  inequality  wo  obtain  th>' 
relation 


(3.17) 


1 U-VpJlqJI  1 + 1 
n — for  J " tM 


i-0 


Observing  that  l-q^+^  » y.  and  usinq  (3.12)  wo  find 


(3.18) 


1 *',P,  doMH.^Ql^)) 


l l 


i!0 '(1-ni)(,iqipi  dottv 


Next  we  deduce  from  (3.15)  the  inequality 
(3.19)  tr(Vl(Vin  - 




Since  the  detciminant  of  is  equal  to  the  product  of  its  eigenval u 

(3.19)  and  the  goometr ie/arithmetic  mean  inequality  to  find  the  relation 


det(Bj+i<njn)) 


Combining  (3.17)  , (3.18)  and  the  above  inequality  we  obtain  the  expression 


g ! p . — 5 

1*0 


ij+1  ~ 


(1*1)  6, 


— !_  J Hi 

""•o’  i.o  •‘il’i 


< f]  n — - 

— 4 dp. 

i”0  i^r 


where  6 is  a suitable  constant. 

4 

Since the inequality (3.13) is trivially satisfied if $\gamma_j = 1$, i.e. in the BFGS method, we need an additional condition for the step size $\sigma_j$ in order to be able to draw further conclusions from the inequality (3.16).


Condition 2.

Let $\gamma$ and $\gamma^*$ be constants satisfying the inequalities

$0 < \gamma < \gamma^* < 1$, $\gamma < \frac{1}{2}$

and let $\sigma_j$ be determined such that

i)  $g_{j+1}'p_j \le \gamma^*g_j'p_j$


ii)  F(x  ) < F ( x ) - > 1 1 o s Jlgjp.  or  o " o*  and  F(x.  ) F(x.  - 
> ♦ 1 i 3333  j ” j 14-1-  3 


(V  is  the  smallest  positive  number  with 


F ( x ^ - o.s.)  * F(x^)  - > • Hu’p. 


A 


m) 


l j 


* if  (>ossible  with 


1 if  S , - 0 


q’s 

XJ i, 

2(F(x  -* ,)  - Fix.)  » «i  a.) 

) l ) 11 


* 0 


Let $\bar{\sigma}_j$ denote the optimal step size. Since $\bar{\sigma}_j$ could be greater than $\sigma_j^+$ and Condition 1 could force $\sigma_j$ to be close to $\bar{\sigma}_j$ we cannot insist on the inequality

$F(x_{j+1}) \le F(x_j) - \gamma\|\sigma_js_j\|g_j'p_j$.

Under suitable assumptions it can be shown [15] that with

$\sigma_j^* = \dfrac{g_j's_j}{2(F(x_j - s_j) - F(x_j) + g_j's_j)}$

we have

(3.20)    $|\nabla F(x_j - \sigma_j^*s_j)'p_j| = O(\|g_j\|^2)$.

Furthermore, it will be shown in the next section that for every update formula (2.5) with $\beta_1\beta_2 \ge 0$, $\sigma_j = \sigma_j^*$ is an acceptable step size for $j$ sufficiently large.
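The step size $g_j's_j / (2(F(x_j - s_j) - F(x_j) + g_j's_j))$ is the minimizer of the quadratic that interpolates $\varphi(\sigma) = F(x_j - \sigma s_j)$ through $\varphi(0)$, $\varphi'(0)$ and $\varphi(1)$. A minimal sketch follows; the quadratic test function, the choice $H_j = I$ (so that $s_j = g_j$), and the decision to raise an error when the interpolating quadratic has no minimizer are assumptions of this example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def interpolated_step(F, x, s, g):
    # Minimizer of the quadratic matching phi(0), phi'(0) = -g's and phi(1),
    # where phi(sigma) = F(x - sigma * s).
    gs = dot(g, s)
    x_trial = [xi - si for xi, si in zip(x, s)]
    curvature = 2.0 * (F(x_trial) - F(x) + gs)
    if curvature <= 0.0:
        # Handling of this case is a choice made for the example only.
        raise ValueError("interpolating quadratic has no minimizer")
    return gs / curvature

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

def F(x):
    return 0.5 * dot(x, mat_vec(A, x)) - dot(b, x)

def grad_F(x):
    return [gi - bi for gi, bi in zip(mat_vec(A, x), b)]

x = [2.0, -1.0]
g = grad_F(x)
s = list(g)                      # s = H g with H = I
sigma_star = interpolated_step(F, x, s, g)
x_new = [xi - sigma_star * si for xi, si in zip(x, s)]
print(sigma_star, dot(grad_F(x_new), s))
```

For a quadratic $F$ the interpolation is exact, so the step coincides with the optimal step size and the new gradient is orthogonal to $s_j$; for a general $F$ the step is only an approximation of the optimal one.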

Using a step size which satisfies Conditions 1 and 2 we obtain the following result.

Lemma 5

Suppose the inequality (3.7) holds and $\sigma_j$ satisfies the Conditions 1 and 2. Then

$\liminf \|g_j\| = 0$.

Proof : 

Since  for  all  i, 


. l.Lsi-lhiPl  "Vi11  s 

^Pj  '-1*  ‘ 6 


■iprqjMpj 


where  t is  a suitable  constant,  it  follows  from  (3.  In)  that 


(3.21) 


j llq.ir  . 

— r — i1,  1 - 0,1 

i„o  qipi  4 (' 


4 


s . 

0 


The sequence $\{F(x_j)\}$ is decreasing. Therefore, $\{x_j\} \subset S_0$. If there is an infinite subset $J \subset \{0,1,\dots\}$ and an $\varepsilon > 0$ such that

$g_j'p_j \ge \varepsilon$ for $j \in J$

then it follows from $g_{j+1}'p_j \le \gamma^*g_j'p_j$ and the uniform continuity of $\nabla F(x)$ on $S_0$ that

$\min\{\|\sigma_js_j\|, \|\sigma_j^+s_j\|\} \ge \varepsilon_1 > 0$ for some $\varepsilon_1 > 0$ and $j \in J$.

Because $F(x)$ is bounded from below and

$F(x_{j+1}) \le F(x_j) - \gamma g_j'p_j\min\{\|\sigma_js_j\|, \|\sigma_j^+s_j\|\}$

this implies that $g_j'p_j \to 0$ as $j \to \infty$, which by (3.21) proves that $\{\|g_j\|\}$ is not bounded away from zero.

We are now ready to prove the main convergence theorem.

Theorem 2

Let Assumption 1 and Conditions 1 and 2 be satisfied. Then

$g_j \to 0$ as $j \to \infty$

and every cluster point of the sequence $\{x_j\}$ is a global minimizer of $F(x)$.

Proof:

It has been shown in [14] that if $F(x)$ is convex and twice continuously differentiable on $S_0$, then the inequality (3.7) holds for all $j$. Therefore, we deduce from Lemma 5 that there is an infinite subset $J \subset \{0,1,\dots\}$ and a $z \in S_0$ such that

$\nabla F(z) = 0$ and $x_j \to z$ as $j \to \infty$, $j \in J$.

If $\{g_j\}$ does not converge to zero, then the sequence $\{x_j\}$ has a cluster point $z^*$, say, such that $\nabla F(z^*) \ne 0$. Since $F(x)$ is convex this implies $F(z^*) > F(z)$. Because $F(x_{j+1}) \le F(x_j)$, this contradiction shows that $g_j \to 0$ as $j \to \infty$. Therefore, it follows from the continuity of $\nabla F(x)$ and the convexity of $F(x)$ on $S_0$ that every cluster point of $\{x_j\}$ is a global minimizer of $F(x)$.


4. Superlinear convergence

In order to prove that the sequence $\{x_j\}$ converges superlinearly to a global minimizer of $F(x)$ we require that in addition to the assumptions stated in the previous sections the following assumption is satisfied.

Assumption 2.

The sequence $\{x_j\}$ converges to a point $z$. The Hessian matrix $G = G(z)$ is positive definite. There is a neighborhood $U_1(z)$ such that the Lipschitz condition

(4.1)    $\|G(x) - G(z)\| \le L\|x - z\|$

holds for all $x \in U_1(z)$, where $L$ is a constant.

The above assumption implies that there are a neighborhood $U_2(z)$ and constants $0 < \mu < \eta$ such that, for every $x \in U_2(z)$,

$$\mu\|y\|^2 \le y'G(x)y \le \eta\|y\|^2 \quad \text{for all } y \in E^n. \tag{4.2}$$


Therefore there is a neighborhood $U(z)$ such that the inequalities (4.1) and (4.2) hold for every $x \in U(z)$. By deleting finitely many members of the sequence $\{x_j\}$ if necessary, we may assume without loss of generality that $\{x_j\} \subset U(z)$.
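For a single symmetric positive definite matrix, the constants in a two-sided bound of the form (4.2) can be taken as the smallest and largest eigenvalue. The sketch below checks this on a hypothetical 3-by-3 matrix; the matrix, the sample size, and the function name are assumptions made for illustration only.

```python
import numpy as np

def curvature_bounds(G):
    """Return (mu, eta): the smallest and largest eigenvalue of a symmetric matrix G."""
    eigenvalues = np.linalg.eigvalsh(G)   # eigenvalues in ascending order
    return eigenvalues[0], eigenvalues[-1]

# Hypothetical positive definite Hessian.
G = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])
mu, eta = curvature_bounds(G)

# Evaluate y' G y and ||y||^2 for many random vectors y.
rng = np.random.default_rng(0)
ys = rng.standard_normal((1000, 3))
quad = np.einsum("ij,jk,ik->i", ys, G, ys)
norms2 = np.sum(ys * ys, axis=1)
```

Every sampled vector then satisfies `mu * norms2 <= quad <= eta * norms2` up to rounding.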

In proving that the sequence $\{x_j\}$ converges superlinearly we will use the weighted matrices

$$G^{1/2} H_j(\eta_j) G^{1/2} \quad \text{and} \quad G^{-1/2} B_j(\eta_j) G^{-1/2},$$

where the symmetric positive definite matrix $G^{1/2}$ is the square root of $G$ and $G^{-1/2} = (G^{1/2})^{-1}$. As a first result we will show that

$$\psi_j = \operatorname{tr}\bigl(G^{1/2} H_j(\eta_j) G^{1/2}\bigr) + \operatorname{tr}\bigl(G^{-1/2} B_j(\eta_j) G^{-1/2}\bigr)$$

is bounded if we choose $\eta_j$ as before and impose an appropriate condition on the step size $\sigma_j$.
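A sum of two weighted traces of this kind can be computed directly once the symmetric square root of $G$ is available. The sketch below assumes, purely for illustration, that the second matrix is the inverse of the first (written `B = inv(H)`); under that assumption the sum is at least $2n$ and equals $2n$ exactly when $H = G^{-1}$, because each eigenvalue $\lambda > 0$ of $G^{1/2} H G^{1/2}$ contributes $\lambda + 1/\lambda \ge 2$. The function names are hypothetical.

```python
import numpy as np

def sqrt_spd(G):
    """Symmetric positive definite square root of a symmetric positive definite G."""
    w, V = np.linalg.eigh(G)
    return (V * np.sqrt(w)) @ V.T      # V diag(sqrt(w)) V'

def weighted_trace(G, H):
    """tr(G^{1/2} H G^{1/2}) + tr(G^{-1/2} B G^{-1/2}) with the assumption B = inv(H)."""
    G_half = sqrt_spd(G)
    G_neg_half = np.linalg.inv(G_half)
    B = np.linalg.inv(H)               # assumption made for this sketch only
    return np.trace(G_half @ H @ G_half) + np.trace(G_neg_half @ B @ G_neg_half)

G = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])
exact = weighted_trace(G, np.linalg.inv(G))        # H equals the inverse Hessian
perturbed = weighted_trace(G, np.eye(3))           # a poor approximation
```

The gap between `perturbed` and `2 * n` gives a scalar measure of how far `H` is from the inverse Hessian in this weighted sense.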

We  observe  that  by  (3.10)  and  (2.15) 
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(4.1) 


H.  . In. ..)  * H.tn.)  - -- - . - J - t-  J 

)'i  }*1  ) t l-n.  i'  >i  ”,  il  i-  w ii 

1 lit  it  t 1 1 


u.u’  n , ” , 

JL  J ♦ J.*l  l'J 

1 w'u . l-n , , . , j • , ” 

11  1*1  1*1  i*l  1*1 


Thi'i I'foio,  choosino  n l-i  . , and  sett  l no 
l j-1 


Vi 


, - 1 , 1 - 1 

, W ' Cl  w . W 0 W . n o G o 

v-  , — -h — i - -i;.-  1 - „ I'l.  j* l . itA 

1 Mj  vj  v, 


w G ' w 


- <l-V 


l i Vi 


>,  w.q. 


q1+lpi*  l 


1 w ' •»  ...  * ..  * 


Vj 


Vi 


u'Gu,  <l',Gii  n , Go 

; « in  J,.  J - 4-J  ♦ liii  j,;  j*1  _ „ 

' ' wiui  Vj  '-'Vi  i",.  1 * 


(I  .-1) 


Vj  >t  " i*r';*i’'i.i 


wo  doduii'  from  (4.1)  and  (1.11)  that  for  ovory  i 

|’’,G)' , *(  l-n  . ) ‘ V,  n ' G ‘q  . 


(4.4) 


. rr  + , , . 

' * ' i ( l-n  . ) i-  .o '.p  ^ ' i v i 


♦ f.  . ♦ l 
1 1 


In  or. l.  i to  show  that  the  sequence  ) is  hounded  we  have  to  derive  m 


trims  i 


^ • This  will  be  done  in  the  following  few  lemmas. 


Lemma 6

Let $W$ be a symmetric nonsingular $(n,n)$ matrix and let $y, x \in E^n$ with $y'x \ne 0$. Then

$$\frac{x'Wx + y'W^{-1}y}{y'x} = 2 + \frac{v'W^{-1}v}{y'x},$$

where $v = y - Wx$.

Proof:

$$\frac{x'Wx + y'W^{-1}y}{y'x} = \frac{x'(y - v) + y'(x + W^{-1}v)}{y'x} = 2 + \frac{(y - Wx)'W^{-1}v}{y'x} = 2 + \frac{v'W^{-1}v}{y'x}.$$
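The identity $(x'Wx + y'W^{-1}y)/(y'x) = 2 + v'W^{-1}v/(y'x)$ with $v = y - Wx$ holds for any symmetric nonsingular $W$ and is easy to confirm numerically. The sketch below evaluates both sides; the test matrices and vectors are arbitrary choices for the illustration, and the function name is hypothetical.

```python
import numpy as np

def identity_sides(W, x, y):
    """Return both sides of the identity for symmetric nonsingular W and y'x != 0."""
    W_inv = np.linalg.inv(W)
    v = y - W @ x
    lhs = (x @ W @ x + y @ W_inv @ y) / (y @ x)
    rhs = 2.0 + (v @ W_inv @ v) / (y @ x)
    return lhs, rhs

x = np.array([1.0, 2.0, -1.0, 0.5])
y = np.array([0.5, 1.0, 1.0, 2.0])          # y'x = 2.5

# A positive definite example and an indefinite (but nonsingular) one.
M = np.random.default_rng(1).standard_normal((4, 4))
W_spd = M @ M.T + np.eye(4)
W_indef = np.diag([2.0, -1.0, 3.0, -4.0])

lhs_spd, rhs_spd = identity_sides(W_spd, x, y)
lhs_indef, rhs_indef = identity_sides(W_indef, x, y)
```

Positive definiteness is not needed for the identity itself; it matters only when the sign of $v'W^{-1}v$ is used afterwards.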


Lemma 7

Under the assumptions stated the sequence

$$\left\{ \frac{\|x_{j+1} - z\|}{\|x_j - z\|} \right\}$$

is bounded and the sum

$$\sum_{j=0}^{\infty} \|x_j - z\| \tag{4.5}$$

is finite.

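Proof:

By Taylor's theorem there is a $v_j$ on the line segment joining $z$ and $x_j$ such that

$$2(F(x_j) - F(z)) = (x_j - z)'G(v_j)(x_j - z).$$

Therefore,

$$\mu\|x_j - z\|^2 \le 2(F(x_j) - F(z)) \le \eta\|x_j - z\|^2, \tag{4.6}$$

which implies

$$\frac{\|x_{j+1} - z\|^2}{\|x_j - z\|^2} \le \frac{\eta}{\mu}\,\frac{F(x_{j+1}) - F(z)}{F(x_j) - F(z)} \le \frac{\eta}{\mu}.$$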

By  Taylor’s  theorem  and  Condition  2 we  have 

V*qiPj  - StlPj  - «»}PJ  ' 




k(x.)  - yo^jjsj  - K(xt-n*s.)  r(x^)  - u^g’s^  + j nllo^Sjir 

Thot  I'l  orn , 

(4.7)  mint  || o ||  , II  ) jl  ~ min(  1-Y*  . 2 ( 1— Y ) 

qjP1 

« ( 1 — y * ) 

n 

Using Condition 2 once more we deduce from (4.6) and (4.7) the relation

$$\begin{aligned}
F(x_{j+1}) - F(z) &\le F(x_j) - F(z) - \gamma\, g_j'p_j \min\{\|\sigma_j s_j\|, \|\sigma_j^* s_j\|\} \\
&\le F(x_j) - F(z) - \frac{\gamma(1-\gamma^*)}{\eta}\,(g_j'p_j)^2 \\
&\le (F(x_j) - F(z)) \left(1 - \frac{\gamma(1-\gamma^*)}{\eta}\,\frac{2\|g_j\|^2}{\eta\|x_j - z\|^2}\,\frac{(g_j'p_j)^2}{\|g_j\|^2}\right) \\
&\le (F(x_j) - F(z)) \left(1 - \gamma(1-\gamma^*)\,\frac{2\mu^2}{\eta^2}\,\frac{(g_j'p_j)^2}{\|g_j\|^2}\right),
\end{aligned} \tag{4.8}$$

where the last inequality follows from the relation $\mu\|x_j - z\| \le \|g_j\|$, see [12] for instance.


Setting

$$c_j = 1 - \gamma(1-\gamma^*)\,\frac{2\mu^2}{\eta^2}\,\frac{(g_j'p_j)^2}{\|g_j\|^2}$$

we obtain from (4.8)

$$F(x_{j+1}) - F(z) \le (F(x_0) - F(z)) \prod_{i=0}^{j} c_i. \tag{4.9}$$


Sini't'  >l!p  ‘ l*  if  folb'w:  from  (l.lb)  that  there  is  «■  1 such  that 


A in,  ii  ' ’ - 1 


Obsotvinu  that  1 11  • II  we  deduce  from  this  inequality  that  f«>t  »*v«»iy  i,  it  least  halt 


of  the  numbers 
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i - 0, 1 , . . . , j 


are  qreater  than  or  equal  to  This  implies  that,  for  every  ,t  least  half  of 

_ 0#1# •••#}*  are  less  than 

that 


munhe » 

■> 

or  equal  to  some  tS*-  < l Thornfnm  i*  « , . 

Therefore, it follows from (4.9) that

$$F(x_{j+1}) - F(z) \le \delta^{j}\,(F(x_0) - F(z)), \quad j = 0, 1, \ldots,$$

which by (4.6) implies that the sum (4.5) is finite.
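The last step of this proof combines a geometric decrease of the function values with a lower quadratic bound of the form $\mu\|x_j - z\|^2 \le 2(F(x_j) - F(z))$. The sketch below makes the resulting bound on the sum of distances explicit. It assumes, as a simplification of the indexing used above, that $F(x_j) - F(z) \le \delta^j (F(x_0) - F(z))$ with $0 < \delta < 1$; the function name and the sample numbers are hypothetical.

```python
import math

def distance_sum_bound(f0_gap, mu, delta):
    """Upper bound on sum_j ||x_j - z||.

    Assumes F(x_j) - F(z) <= delta**j * f0_gap and
    mu * ||x_j - z||**2 <= 2 * (F(x_j) - F(z)), so that
    ||x_j - z|| <= sqrt(2 * f0_gap / mu) * sqrt(delta)**j,
    which is a geometric series with ratio sqrt(delta) < 1.
    """
    if not (0.0 < delta < 1.0) or mu <= 0.0 or f0_gap < 0.0:
        raise ValueError("need 0 < delta < 1, mu > 0 and f0_gap >= 0")
    return math.sqrt(2.0 * f0_gap / mu) / (1.0 - math.sqrt(delta))

bound = distance_sum_bound(f0_gap=4.0, mu=2.0, delta=0.81)

# Partial sum of the worst-case distances allowed by the two assumptions.
partial = sum(math.sqrt(2.0 * 4.0 * 0.81**j / 2.0) for j in range(200))
```

The ratio of the series is $\sqrt{\delta}$ rather than $\delta$ because distances scale like the square root of the function-value gap.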
Lemma 8

The assumptions stated imply that

$$\|d_j - Gp_j\| = O(\|x_j - z\|)$$

and that the sum

$$\sum_{j=0}^{\infty} (\tau_j - 2)$$

is finite.

Proof:

By Taylor's theorem

$$d_j = \frac{g_j - g_{j+1}}{\|\sigma_j s_j\|} = Gp_j + E_j p_j, \tag{4.10}$$

where

$$E_j = \int_0^1 G(x_j + t(x_{j+1} - x_j))\,dt - G.$$

Hence

$$\begin{aligned}
\|E_j\| &\le \max_{0 \le t \le 1} \|G(x_j + t(x_{j+1} - x_j)) - G\| \\
&\le \max_{0 \le t \le 1} L\,\|x_j + t(x_{j+1} - x_j) - z\| \\
&\le L \max\{\|x_j - z\|, \|x_{j+1} - z\|\} = O(\|x_j - z\|),
\end{aligned} \tag{4.11}$$

where the last relation follows from Lemma 7. Using the inequality $d_j'p_j \ge \mu$ and Lemma 6 we have therefore

$$0 \le \tau_j - 2 = O(\|x_j - z\|^2), \tag{4.12}$$

which by Lemma 7 implies that the sum

$$\sum_{j=0}^{\infty} (\tau_j - 2)$$

is finite.
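The estimate $\|d_j - Gp_j\| = O(\|x_j - z\|)$ can be observed on a small example. The sketch below assumes the conventions $x_{j+1} = x_j - \sigma_j s_j$, $p_j = s_j/\|s_j\|$ and $d_j = (g_j - g_{j+1})/\|\sigma_j s_j\|$, and uses a hypothetical test function $F(x) = \tfrac12 x'Ax + \tfrac{C}{4}\sum_i x_i^4$ with minimizer $z = 0$ and $G = A$. For this function $\|G(x) - G\| \le 3C\|x\|^2$, so the difference is bounded by $3C \max\{\|x_j\|, \|x_{j+1}\|\}^2$.

```python
import numpy as np

A = np.diag([1.0, 2.0, 4.0])   # Hessian G at the minimizer z = 0 (hypothetical)
C = 0.5                        # weight of the quartic term (hypothetical)

def grad(x):
    # Gradient of F(x) = 0.5 x'Ax + (C/4) * sum(x_i**4).
    return A @ x + C * x**3

def direction_and_difference(x_j, x_next):
    """Return p_j and d_j for two consecutive points."""
    step = x_j - x_next                    # equals sigma_j * s_j
    length = np.linalg.norm(step)
    p = step / length
    d = (grad(x_j) - grad(x_next)) / length
    return p, d

def deviation(x_j, x_next):
    """Return ||d_j - G p_j||."""
    p, d = direction_and_difference(x_j, x_next)
    return np.linalg.norm(d - A @ p)

base_j = np.array([1.0, -1.0, 0.5])
base_next = np.array([0.3, 0.2, -0.1])
err_far = deviation(0.1 * base_j, 0.1 * base_next)
err_near = deviation(0.01 * base_j, 0.01 * base_next)
```

Shrinking both points toward $z$ by a factor of ten reduces the deviation by roughly a factor of one hundred for this particular function, which is stronger than the first-order rate stated in the lemma.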

1 .etnna  '• 


Tin-  assumptions  stated  imply  that 


Proof : 

By  definition  a. 


"diVdipr 


Therefore,  usino  (4.101  wi>  have 


“1  ' <2  niPitk,j  4 ',1PjGP1) 


i aiqi  /dl‘sY 

-i  . /_  a J—l  /.«•  - .op  i.,  , i.  J „•  (d  - p 

•fa  ‘ <**Pj  WJ  P1Ej,q1  ^d'pj  1 1M1 


iV’ 


-A-  <-  '.A.'1-  * • pjVl  * (4^)  riK 


w >1 


ri 


d;pj 


i«v 


H^ll  MMjll  t IMjir 

■ wVi  * (djp j): 


or  i - 

w1qt 


n,  dip.  ' »i. 


In  ordei  to  find  an  upper  bound  for  the  terms  and  i.  ^ we  observe  i 

the  definition  of  w(,  - L ^ * 0 if  ( is  the  optimal  step  size.  Thus  w. 

<f  • I.  by  imposino  a condition  on  0 ^ which  ensures  that  o^  is  nufficien 
optimal  step  size. 

Condition 3.

The step size $\sigma_j$ is determined such that for all $j$


liy  ( l . *J  ' . Mi 

\ t'ont » Ol 
l 's»*  i.'  the 




li->j«.<7 / ,n<4v 

1 | (qjU'EjV  Vl'VrVl’  l(qj+l”Cjdj>  'Hj+l<qJ  + l“Cjdi>/  /- 


aMl^ilM^I’ 


•-■jldl-uilKUJ) 


Wr'iV'Vi^rW'  \(ViVj,,Vi(Vr'YV]  f 1 v 


£ . S.l'j 

j d;pi  ’ 


is  a positive  constant  and  I ) is  a seouence  of  positive  numbers  such  that 


is  finite  and  either  a.  - o*  or 

j j 


'juPji  i iVF(xr°j*sj)'Dj|- 


Condition 3 is trivially satisfied if $\gamma_j = 1$, i.e., for the BFGS-method. If $\sigma_j$ is the optimal step size then $\varepsilon_j = 0$. For every $j$, there is therefore an interval, containing the optimal step size, such that every $\sigma_j$ in this interval satisfies Condition 3.


Since by Lemma 7 the sum

$$\sum_{j=0}^{\infty} \|x_j - z\|$$

is finite and $\|g_j\| = O(\|x_j - z\|)$ (see [12], for instance) it is possible to choose

$$\nu_j = \|g_j\|, \quad j = 0, 1, \ldots.$$

In the next lemma it is shown that Condition 3 implies Condition 1.


Lemma 10

If $\sigma_j$ satisfies Condition 3, then

i) $\sigma_j$ satisfies Condition 1


ii)  £ ( ♦ C.)  is  finite. 

j*0  3 3 


Proof : 

that 

(4.13) 


‘Vr'jW^vr*  idi'  ■ qjMHj*iqj«i ' 'jdjpj  - qjn*Viqj*i 


we obtain the relation


i4. 14) 


■ I <gt-»rciV  ,<rl(qn-rSV  _ 


U-yJ 


1 1 lt9j*i"‘ idi’  Hj+i<9jti"£j<V ,qj+iHj+iqj+i 


|c5djG-1ar2Fjdjcrlgj+xl 


(q.i+r'  iVVi<9i+l‘*lV  J 


o«Vj’ 


whore  the  equality  follows  from  (4.13)  , Condition  i and  tho  fact  that  ||i  ^d,||  * 0(|g*4^p^l). 

Replacing  cl  * with  the  unit  matrix  wo  deduce  from  (4.14)  that  c ^ satisfies  Condition  \, 
Mv'itov't  ; nee  it  follows  from  (*.?4*  that 

L vxl'*l  . (<W  idi),G"1(qin-  iV 

h wiqi  (9iM-,idj’,"ju(9j4i-fidJ) 
we obtain from (4.14) the inequality

>q-lM  ^i>slovj  • 1 ' 

fot  some  const  ant  Since  it  fol  lows  from  (2.16)  , (3.4),  and  (3.14)  that 

. > (siicr±piV:^-ctjL1^ 


w)’j 


we have similar to (4.14) the inequality


h 'vrW’V'Vr'yV 


I 1 .16) 


r v — J — 
1 y) 


|q,.r'  iW'Vi*' 


..'•i'iri-r/j-V' 

vwvv  iv 


1 1 i 


n- 


for  some  constant  ^ j j • 

Using Condition 3 we will now establish the boundedness of the sequence $\{\psi_j\}$ and two important consequences.

Lemma 11

Let Assumptions 1 and 2 be satisfied and suppose that the step size $\sigma_j$ satisfies Conditions 2 and 3. Set $\eta_j = 1 - \gamma_{j-1}$. Then

i)  The  sequence  {^.)  is  bounded, 

ii)  The  sequences  { ||h  , (n^)  ||  ) and  ) ||n.  are  bounded, 

iii)  ||  tl-njp^q^  - Gp  ||  «0  as  j * ■». 


Proof • 


i) By (4.4) and Lemma 6 we have for every $j$


(4.17) 


Vi  - V2  * V V S ‘ ui 


(4.1«) 


- *J  4 *iallxrz|1  * ,sio4Sn)vj  4 sn  w’qj 

where  and  6^  are  positive  constants  and  the  last  inequality  fol  low  turn 

(4.12),  14.15)  # (4.1b),  and  I.emnui  (i,  Because  foi  every  i 

I', 


il1 , 21  1 and 


1 


Wjqj  “ " u 


-1-  < -1 


we  obtain  from  (4.17)  the  relation 

6 


Vi  - V1  4 (*i2 4 -r’liv*11  4 (,io  + Aii’V 


where  <5  - & ♦ — and  6lc  - $ ♦ & Therefore, 

14  12  u 15  10  11 


*in  ' **n  ^ v'n(l  ♦ ^ ||x  -*  ||  ♦ fl%cv.).  Since  by  Lemma  7 and  Condition  ' 

)♦!  - 0 14"  l 15  i * 


the two sums

$$\sum_{j=0}^{\infty} \|x_j - z\| \quad \text{and} \quad \sum_{j=0}^{\infty} \nu_j$$

are finite this shows that $\{\psi_j\}$ is bounded.


x i > Because  H,(n^)  and  H.  (n.)  are  positive  definite  for  every  j me. 
to  the  sum  of  the  eigenvalues  of  H^(n^)  and  H.  (n.)  the  second  • •• 
theorem  follows  from  the  boundedness  of  {||<  1, 


i -n*  it  t hi 


iiil  By  (4.4)  we  have  for  every  j 


f fw'-vM";0'1’! . \ , , j 

1.0  V '‘‘WIP  I - 0 tic 


(t  - 2 ♦ V.  + t . 4 . ) 

j 5 3 1 


Since  by  lemmas  7 through  10,  inequality  (4.18)  and  part  i)  of  the  tin 


I (t  - 2 ♦ * ♦ i ♦ Hj) 
j-0  ‘ 333 


the  inequality  (4.18)  implies  that 


(AjvMl-OV^r'a 

" “-vwi 


Hy  the  spt'onti  part  of  thp  theoi pro 


2 as  j 


(l~'V,  )qjpi  " (1~''j’  °lqjHj<nj,qi 


is  bounded  away  from  zero.  Therefore  it  follows  from  (4..1)  and  1 ore 


(l’nJ>r1qi  " ^V1  * 0 **  1 


Bctoie  wo  can  use'  t hi'  above  results  to  prove  the  supetlineat  conveioenc, 

( * ( ' to  * wo  need  some  properties  of  the  two  sequences  ( > ’■  and  (,*'.  . 

I i shed  in  the  following  two  lemmas. 


let  Assumptions  1 and  2 and  Conditions  2 and  < be  satisfied.  Then  toi 
(.’.*>1  with  ♦ 8 , t 0 the  following  statements  hold. 


» 


i) 

If 

B 

12 

2 0 then  * 1 as 

i * »*, 

ii) 

If 

V2 

0 , t hen 

) . * 1 or 

*1 

el  as 

iii) 

If 

'j  * 

1 as  j -»  •»,  then 

U-> 


/(l«4+1ll\2 


Proof ; 

Since 

(4.22) 


d’q.  - p'OcU  ♦ (d,  - Gp.)  *q 


3’J  rj  ’)  '“j  v,pj' 

“ * (Gf>j  - ‘l-yPjVMj  ♦ Cdj  - dp.l'q. 

- (Go.  - (l-nj)Piqj)*qj  * (d.  - Gp.)^ 
if  follows  from  Lemmas  S and  11  that 
(4.23) 


d^q1  • 0 as  j 


Let  s 0.  Because 


|8ldjPj  * e2djHjdjl  - I^JP,  2 l»xlu  ' 0 

and.  by  part  ii)  of  Lemma  11.  w^  is  bounded  away  from  zero  it  follows  from  (2. 
(4.23)  that  * 1 as  j • Now  assume  that  Sj  « 0.  By  (2.20)  and  (2.17) 


(4.24) 


1 k'jqjPi  ‘'iq4p(  <dX>‘ 

; d h i - i ♦ — u_ 

J <d’jpj’ 2 ’ j mjp/  Vj 


f q;p, 


(d’ry'w’q.  / IIs 

u;p.  (d!q.); 


Jl 


*( 


« , . , im i laiV  Vj_ 

,6\llqj  II  (d’pi)2w'q1  /'*  1-1 


>4- 


21)  and 


some  positive  constant  V , where  the  inequality  follows  from  the  relation 

It 


Ji. 


H . ( r|  ) q . 

j 5 ’ 1-ni  Pj  'j-1 


which  by  part  ii)  ot  lemma  11  implies 


IhJ 


0(- ) . 

j"  h-i 


sinc»  by  lemma  2,  1 > . 1 for  j * 0,1,2,...  we  deduce  from  14.23)  and  (4.24'  that  ♦ 1 


Finally  let  t'.t'.,  v 0.  By  (2.17)  we  have 
l i 


s4  . .’51 


/ dX  <d]V 

i.d’.p  * 6,d'.H.d.  - d'.p.  S,  » 6,  - ■ J J ♦ tt,  —.-2-2 

1 j 1 2 11]  r A 1 2 O.q'p.  2 wjq.djp. 


Pm  t hoi  more  , since 

■I'p  p'Gp,  ♦ (d  - Gp.l'p.,  1 .p.q'p  - p .Gp  * (1  ,p.q.  - Gp.l'p 

n i i i i V )-l  ri  ) i i i-i  j ) 1 1 * i 


it  fi'l  s from  lemmas  H and  11  that 


l4 . 26' 


dip,  dip. 

l , . -J-2.  ->  (l  + t ),  i - 0 as;  i 

q ,p  l-i  1 1 .c  q !p  ]-i  1 1 

i J 1 1-1  1 i 1 


■ i4.25)  , and  i4..'n)  we  see  that  there  is  > - o and  i such  that  l - > 

0 1-1 

and  1 - i imply 


.i'p 


ViV'^Vj1  - V'h  ♦ sai  ' 0 


'!  • , it  follows  f i om  (2.21'  and  (4.211  that  the  sequence  *.  1 - % ' either  convetqes  to 

<■>  bounded  away  tt.m  zero.  In  the  latter  case  (.'.21'  and  (4..’ 11  imi-ly  that 

jC"r  ••  • 0 as  1 • -*  which  by  (4.251  and  (4.261  shows  that  t^  ♦ t*  ,1  ^ j * 0 as 


P 


(4.2 


, d-.p  ! )*\vj*i** 

\ Y ) . 3 ' > 


,i  i j i<yp2'  ' 0 


, . since  by  definition 

for  j sufficiently  UK*-  ' 

- I rrTnr  ||°j*jl 


Stfii  M 


\ 3 


v>  . S , 1 

i r 


V 


3 3 ' 


an»i , untie 


M/||„  , |h  is  bounded  it  follow 

,,  Assumption  2.  1!1VI'I>°1S*" 


s from  (2.21)  and  (4.: 


: x _ > , ! • o<min( 


f'lliii'Y,  (aw  )2 
\H^Tl  I j 1 


‘D  . 


la'mma 


1 ) 


l.et  Aasumyt  ions 


l and  2 and  Conditions 


, » be  .«!.««>•  *-  <or  ”v"v 


12.*)  with  ♦ $ 2 


* 0 and  f ot 


3 sufficiently 


VF  ( x 


Fix 


_ ,,«s  ) • Fix.)  * i 1 0*si^oi!  j 

1 J j - 3 


proof : 

First  assume 
in  the  set 

(4.26) 
such  that 


» that  6,-0.  i-1'-  ^ 


, i hv  Taylor’s  theorem 
1 . Then  o ^ 1 ' 


. . o < t *•  1' 

U I « ■ x)  ‘ tb)'  - ' 


VF 


Fix  - 


*’  .o.)PA' 


*v,  S7  *s»  - ' " ’ 

3 » 


0 as  5 ** 


this  'mV 


H s ||  . o<quy  and  "Gpj  ‘ ''fy 

w 1 ; u™..  •*  - 


VTI«,  - •,<>, 


(4.26)  such  that 


(4.29) 


'.iV,)F, 


rU)  . - r..,.  - = 'C'-’  i 




?)  that 


jate  Cormul* 


there  i*  v 


\ i oi;  i hat 
i»  tha 


S i nee 


P,jG(yj)Pj  = e^Pj  + (GP.  - + P^(G(y.)  - Op 




i)  The  sequences  {||tL||l  and  ( || H ^ ||)  are  bounded. 


"Vr81'  . 

0 as  3 - ipr^ir  " 0 as  3 


iii)  The  two  sums 


• I n1+1h\  " f»xj+rz| 

i wr)  and  io  >JiW 


are  finite. 

iv)  If  61B2  >_  0 or  B1B2  < 0 and  y ■*  1 as  j -*•  then 


a -*■  1 as  and  a . -*■  I as  j 

I 3 

where  CK  denotes  the  optimal  step  size. 


v)  If  8, 6_  < 0 and  y.+-  — as  j -*■  “ then 

12  3 p2 

S2  - 62 
o . - — as  j -»  “ and  o . •*  - t—  as  ] 

fl  3 S. 


3 8n 


vi)  If  81B2  >_  0 or  Sie2  < 0 and  y + 1 as  j • then 


a . = a* 
3 3 


for  j sufficiently  large,  provided  |( g ^ (|  = O(v^) 


The  first  statement  of  the  theorem  follows  immediately  from  part  ii)  of  Lemma  11  and  parts 
i)  and  ii)  of  Lemma  12. 

Since  1-n . = y.  . it  follows  from  (4.19)  and  (4.20)  that  the  sum 
3 3-1 

- (’■yyyyfo'S 

j«0  \ »j-l  I 

is  finite.  By  Lemmas  6 and  11  this  implies 


i4.  M) 


- CP,  I 


Furthermore it follows from (4.10) and (4.11) that


(4.  M) 


rliill.  hA-  -c£4|U  He  I 

ifpjlt  _ llC|jl!  iPp"  11  v 


First assume that $\beta_2 = 0$, i.e., $\gamma_j = 1$, for $j = 0, 1, 2, \ldots$. Then Lemma 13 implies that

$$\sigma_j = \sigma_j^* = 1 \quad \text{for } j \text{ sufficiently large}. \tag{4.33}$$


|pjqj  * A1 


IhJI  ^4  S4 

Ar  IIttA  - G rrA 


lsjll  "IIP, I 


and,  by  the  first  part  of  the  theorem,  ( || q || / 1[ ||  ) is  bounded  we  deduce  from  (4.5),  (4.31) 


(4  . 12)  , and  (4.33)  that 


|4  . <4) 


0 as  j k » and  / 


* 


Now suppose that $\beta_2 \ne 0$. Then it follows from Condition 3 that either


■ " u*  or  I u ' p.l  v |v'F(x.  - ojsl'p.  | 

1 1 1 1 + 1 j 1 1 ) 1 j 1 


Ry  ( 1 ’01  \ his  unpl  ios 


(4  . 1M 


V\.  • t h*"  ’.not  ■ , ho.  .ms*'  <■-•.•1*  | 1 | fo»  instance) 

(4.  hi  II  dj  II  ” 0 ( 1 1 x ^ — z 1 1 ) and  ||x^-r.||  -0(||q^| 

w«  l«d  * rom  t4.<‘*>  and  lemma  / that 


(*♦.  : 


v “’Jii’i.1  . 


iwi  ll‘*, I 


>\  4.  mi  , and  I «*mma*.  and  M w»*  have 
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Finally  we  deduce  from  Condition  3 and  Taylor's  theorem  the  inequalities 
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Therefore  the  parts  iv)  and  v)  of  the  theorem  are  a consequence  of  Lemmas  12,  13  in  (4.41) 
through  (4.43). 

To  complete  the  proof  of  the  theorem  we  observe  that  in  view  of  Lemma  It  il  •fficnit  * 
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of  Condition  3 for  j sufficiently  large. 
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is  bounded  away  from  zero.  By  the  first  part  of  the  theorem  this  implies  that  t ti  -ciu-onc 
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is  1 un.led.  Therefore  we  obtain  from  part  iii)  of  Lemma  12  the  relation 
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large'.  A completely  anoloquous  argument  shows  that  the  second  inequality  is  : •« 

l if  sufficiently  large. 
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